2024 Mar 7th – UQ PUG 6
Welcome to UQ Python User Group! Check out our general information for details about who we are and what we do.
Overview
Structure
- We start today by adding our names to the table below
- Add your questions to this page
- This month’s presentation
- Finally, we spend the rest of the session answering the questions you’ve brought!
Mailing list
If you would like to be on the mailing list and receive the latest PUG updates, please sign up here:
https://forms.office.com/r/6qvfFX0qGr
Feel free to send this link to anyone you think may benefit.
Training Resources
We offer Python training sessions and resources. You can find our introductory guide here.
Today’s Presentation
Today we looked at the popular plotting package matplotlib. You can find the Jupyter notebook we worked from on our GitHub page.
Introduce yourself
What’s your name? | Where are you from? | Why are you here? |
---|---|---|
Nick Wiggins | UQ Library | Here to help (and learn!) |
Cameron West | UQ Library / SMP | To learn and help |
Dennis | SMP | To learn and get help |
Karen Fang | UQ BEL | To learn |
Valentina | UQ Library | Here to say hello! |
Theophilus Mensah | QAAFI | here to learn |
Y Allo | SMP | Learn this software |
Senn Oon | UQ Library | Placement student observing! |
Jay | UQ | to learn python |
Questions
If you have any Python questions you’d like to explore with the group, please put your question and name in the sections below.
If you think you can help, feel free to contribute to the answers section!
Question 1 - Tuple or List use - Nida
when to use tuple and when to use list?
## when we create a table using pandas DataFrame, we can pass the column names as a tuple like below:
df = pd.DataFrame(columns=("number", "name", "major"))
## or as a list like below:
df = pd.DataFrame(columns=["number", "name", "major"])
## so on what occasions do we use a tuple & on what occasions do we use a list? Thanks so much
Answers
- Use a tuple when the elements must not change. Tuples are immutable, so you can't edit their elements after creation; reach for one only when you need that guarantee, and use a list otherwise.
- Tuples can also be faster, since they have a constant size (according to StackOverflow).
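As a rough illustration of the immutability point above, here is a minimal sketch in plain Python. The variable names and the `point_labels` dictionary are made up for this example and are not from the session.

```python
# Lists are mutable: elements can be replaced or added in place.
columns_list = ["number", "name", "major"]
columns_list[0] = "id"
columns_list.append("year")
print(columns_list)

# Tuples are immutable: the same item assignment raises a TypeError.
columns_tuple = ("number", "name", "major")
try:
    columns_tuple[0] = "id"
except TypeError as error:
    tuple_error = str(error)
    print("Cannot edit a tuple:", tuple_error)

# Because a tuple of hashable items never changes, it can be a dictionary key.
# A list cannot be used this way.
point_labels = {(0, 0): "origin"}  # hypothetical data for illustration
print(point_labels[(0, 0)])
```

For the column names of a DataFrame, either form is accepted, so a list is the usual choice unless you specifically want the names to be fixed.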
Question 2 - matplotlib - Nida
I have tried matplotlib for making a single line graph, but I still don't know how to use it with multiple data series, as follows:
# I have these Optical Density in 750nm data that show the growth curve for each species
from tabulate import tabulate
OD750_data = [["Species","8 Dec 2023","15 Dec 2023", "18 Dec 2023"],
              ["Nannochloropsis", 0.2615, 1.0385, 0.822],
              ["Phaeodactylum", 0.208, 0.603, 0.499],
              ["Tisochrysis", 0.1015, 0.135, 0.1265]]
table1 = tabulate(OD750_data)
print(table1)
table2 = tabulate(OD750_data,headers='firstrow')
print(table2)
# how do I turn this table into line graphs with 3 growth curves for 3 species?
Answers
- Great question Nida. Some of the difficulty lies in the data shape; the easiest way to use data for these things is in long form. I've used a few steps below:
  - Turn it into a pandas dataframe
  - Transform it into three columns and nine rows with pd.melt (long form). Run print(OD750_data) to see this
  - Plot each line by subsetting with OD750_data["Species"] == followed by the species name

```python
import pandas as pd
import matplotlib.pyplot as plt

# I have these Optical Density in 750nm data that show the growth curve for each species
OD750_data = [["Species", 8, 15, 18],
              ["Nannochloropsis", 0.2615, 1.0385, 0.822],
              ["Phaeodactylum", 0.208, 0.603, 0.499],
              ["Tisochrysis", 0.1015, 0.135, 0.1265]]

# Step 1: turn it into a pandas dataframe
OD750_data = pd.DataFrame(columns = OD750_data[0], data = OD750_data[1:])

# Step 2: transform it into long form (columns: Species, time, value)
OD750_data = pd.melt(OD750_data, "Species", var_name = "time")

# Step 3: plot one line per species by subsetting the long-form data
plt.plot("time", "value", data = OD750_data[OD750_data["Species"] == "Nannochloropsis"])
plt.plot("time", "value", data = OD750_data[OD750_data["Species"] == "Phaeodactylum"])
plt.plot("time", "value", data = OD750_data[OD750_data["Species"] == "Tisochrysis"])

plt.show()
```
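A possible variation on the answer above, offered only as a sketch: loop over the species with `groupby` so the plotting call is not repeated, and add a legend so the curves can be told apart. The names `wide`, `long_df`, and the axis labels are assumptions for this example, and the `Agg` backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Same values as in the question; column labels 8, 15, 18 are the sampling days.
wide = pd.DataFrame(
    columns=["Species", 8, 15, 18],
    data=[["Nannochloropsis", 0.2615, 1.0385, 0.822],
          ["Phaeodactylum", 0.208, 0.603, 0.499],
          ["Tisochrysis", 0.1015, 0.135, 0.1265]],
)

# Long form: one row per (species, day) pair, nine rows in total.
long_df = pd.melt(wide, id_vars="Species", var_name="time", value_name="value")
long_df["time"] = long_df["time"].astype(int)

# One line per species, labelled for the legend.
fig, ax = plt.subplots()
for species, group in long_df.groupby("Species"):
    ax.plot(group["time"], group["value"], marker="o", label=species)
ax.set_xlabel("Day of December 2023")
ax.set_ylabel("OD750")
ax.legend()
```

In a notebook you would normally drop the `matplotlib.use("Agg")` line and finish with `plt.show()`.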
Question 3 - Question - Dennis
Reserve[:,n+1] = Reserve[:,n].reshape(M,1) + (P - S).reshape(M,1)
ValueError: could not broadcast input array from shape (1000,1) into shape (1000,)
Comment: I am pretty sure the broadcast dimensions match. I just don't understand this ".reshape" thing.
```python
## Code for Q3
import numpy as np
#%%
def total_claim_amount(paths, lambda_parameter, mu_parameter, step_size):
    #Simulate the number of claims (N): Poisson distribution
    num_claims = np.random.poisson(lam = lambda_parameter*step_size,
                                   size = paths)
    dime = np.max(num_claims)
    #Simulate claim sizes (X): Exponential distribution
    # claim_sizes = np.random.exponential(scale = mu_parameter,
    #                                     size = (paths, dime))
    # claim_sizes = np.random.pareto(a = mu_parameter,
    #                                size = (paths, dime))
    claim_sizes = np.random.weibull(a = mu_parameter,
                                    size = (paths, dime))
    #check to create True (1) and False (0)
    #So, only the correct number of claims are counted
    check = np.arange(dime) < num_claims[:, None]
    #Calculate total claim amount for all paths
    #sum across the rows: axis = 1
    #total was a row vector: reshaped into column vector
    total = np.sum(claim_sizes * check, axis=1).reshape(paths, 1)
    #Calculate mean and standard deviation of total claim amounts
    mean_total = np.mean(total)
    std_total = np.std(total)
    # print("Mean total claim amount:", mean_total)
    # print("Standard deviation of total claim amount:", std_total)
    return total, mean_total, std_total

def premium(lambda_parameter, mu_parameter, step_size):
    p = lambda_parameter * mu_parameter * step_size
    return p
#%% Parameters
M = 10**3 #Number of paths
N = 360 #Number of time steps (days)
lambda_parameter = 12 #Poisson distribution parameter for claim number process
#1 claim month
mu_parameter = 1 #Exponential distribution parameter for claim size process
#mean claim size is $1
Reserve = np.zeros((M,N))
Reserve[:,0] = 100 * mu_parameter * np.ones((M,))
#Compute premium income
P = premium(lambda_parameter, mu_parameter, 1/N) * np.ones((M,1))
#%%
for n in range(N - 1):
    #Compute total claim amount process
    S,_,_ = total_claim_amount(M, lambda_parameter, mu_parameter, 1/N)
    #Compute reserve for the next time step
    Reserve[:,n+1] = Reserve[:,n].reshape(M,1) + (P - S).reshape(M,1)
```
Answers
Hi Dennis, we might need a few more details about what you need here, but I can tell you why the reshaping isn't working. Essentially, the .reshape method takes the one-dimensional slice Reserve[:,n], which has shape (M,), and turns it into a two-dimensional column of shape (M,1). It works! Run Reserve[:,n].reshape(M,1) + (P - S).reshape(M,1) by itself to see. The problem is that the (M,1) result can't be assigned to Reserve[:,n+1], because that slice is one-dimensional with shape (M,). Do you need to reshape at all here?

In context, every column in the Reserve matrix is a time step (e.g. t=1, t=2 etc.), and every row is just a different trial (trial 1, 2, 3 …). What I want to achieve is using the column t=n value to calculate the column t=n+1 value. This is why I used a loop. My intent in reshaping was to make sure the dimensions of the left and right sides match.
This is why I am confused by the error message, because I essentially have the correct dimensions, but somehow I can't broadcast it correctly??? - Dennis
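To make the shape mismatch concrete, here is a small sketch with made-up sizes and values (M = 4, N = 3, and a constant stand-in for the simulated claims S). It reproduces the error and shows two possible ways around it. These are suggestions and were not confirmed by Dennis in the session.

```python
import numpy as np

M, N = 4, 3                      # small hypothetical sizes for illustration
Reserve = np.zeros((M, N))
Reserve[:, 0] = 100.0
P = 0.5 * np.ones((M, 1))        # premium as a column, shape (M, 1)
S = np.ones((M, 1))              # stand-in for the simulated claims, shape (M, 1)

print(Reserve[:, 0].shape)       # indexing a single column gives a 1-D array
update = Reserve[:, 0].reshape(M, 1) + (P - S)
print(update.shape)              # the right-hand side is a 2-D column

# Assigning the 2-D column to the 1-D slice reproduces the error.
try:
    Reserve[:, 1] = update
except ValueError as error:
    broadcast_error = str(error)
    print(broadcast_error)

# Option 1: keep everything 1-D by flattening the (M, 1) column.
Reserve[:, 1] = Reserve[:, 0] + (P - S).ravel()

# Option 2: keep everything 2-D by slicing with a range, which preserves
# the column shape (M, 1) on both sides of the assignment.
Reserve[:, 2:3] = Reserve[:, 1:2] + (P - S)

print(Reserve)
```

Note that simply dropping the first reshape is not enough: adding a (M,) array to a (M,1) array broadcasts to shape (M,M), so one side still has to be flattened or sliced as above.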
Question 4 - Example of Conditionals - Nida
Answers
# What are conditionals?
# 'if' statements. These are the control blocks for your code, they're logical filters.
a = 10
b = 20
print(a > b)
# Checks if a > b
if a > b:
    print("a is greater than b!")
    print("Everything that is indented will run")
    print("see!")
# elif statements only check if the previous statements failed!
elif a == b:
    print("We also checked if they were the same. They are!")
elif a < 0:
    print("a is a negative number and less than b")
# else statements capture everything that failed. They don't have a condition
else:
    print("Everything failed. a must be 0 or greater and less than b")
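To see each branch run, one option is to wrap the same checks in a function and call it with different values. This is only a sketch. The function name `describe` and the test values are made up for this example.

```python
def describe(a, b):
    """Return a message saying which branch of the conditional ran."""
    if a > b:
        return "a is greater than b"
    elif a == b:
        return "a equals b"
    elif a < 0:
        return "a is negative and less than b"
    else:
        return "a is 0 or greater and less than b"

# Each pair is chosen to trigger a different branch, in order.
for a, b in [(30, 20), (20, 20), (-5, 20), (10, 20)]:
    print(a, b, "->", describe(a, b))
```

Only the first condition that is true runs, so the order of the checks matters.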